Comparison between two models of language for the automatic phonetic labeling of an undocumented language of the South-Asia: the case of Mo Piu
Abstract
This paper reports an assessment, by an expert phonetician, of the automatic labeling of Mo Piu, an undocumented, unknown, unwritten and under-resourced language of North Vietnam. In the previous stage of the work, seven sets of languages, drawn from Mandarin, Vietnamese, Khmer, English and French, were compared in order to select the best language models for the phonetic labeling of isolated Mo Piu words. The two sets that obtained the best scores, (1) Mandarin + French and (2) Vietnamese + French, showed an additional distribution of their results. Our aim is now to study this distribution more precisely and more extensively, in order to select statistically the best language models and, among them, the best sets of phonetic units, that is, those that minimize automatic phonetic labeling errors.
Similar resources
Towards the Mo Piu Tonal System: First Results on an Undocumented South-Asian Language
This paper presents the first results on the Mo Piu tonal system. This language is undocumented, unwritten and, moreover, in great danger. As phonetic and tonal labeling is hard to carry out when references on the language are lacking, this paper presents our method for building reliable data in order to understand the tonal system, and the main findings concerning the ...
Towards the tonal system of an unknown language from south-east Asia: a deeper insight
This paper focuses on melodic and tonal analyses of Mo Piu, a language without a writing system spoken by an endangered ethnic minority of North Vietnam, in South-East Asia. The Mo Piu language is a still unknown branch of the Hmong-Mien family. Building on previous experience, we try to gain a deeper insight into the tonal system of this language, with the support of a dedicated tool,...
Comparison of spectral methods for spoken language identification
Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...
The Concept of Educational Culture Language: A Case of Labeling
The verbal interaction among students, teachers, parents, and school administrators plays a significant role in achieving educational goals, as the language used in educational settings, while mirroring the dominant educational culture, functions as an important tool in shaping and reshaping their beliefs. Thus, the culture language in any educational setting needs to be studied in order to ident...
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines, as these engines are developed on the basis of frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...